In [1]:
pip install geopandas
Requirement already satisfied: geopandas in c:\users\swati\anaconda3\lib\site-packages (0.14.4)
Requirement already satisfied: fiona>=1.8.21 in c:\users\swati\anaconda3\lib\site-packages (from geopandas) (1.9.6)
Requirement already satisfied: numpy>=1.22 in c:\users\swati\anaconda3\lib\site-packages (from geopandas) (1.26.4)
Requirement already satisfied: packaging in c:\users\swati\anaconda3\lib\site-packages (from geopandas) (23.1)
Requirement already satisfied: pandas>=1.4.0 in c:\users\swati\anaconda3\lib\site-packages (from geopandas) (2.1.4)
Requirement already satisfied: pyproj>=3.3.0 in c:\users\swati\anaconda3\lib\site-packages (from geopandas) (3.6.1)
Requirement already satisfied: shapely>=1.8.0 in c:\users\swati\anaconda3\lib\site-packages (from geopandas) (2.0.4)
Requirement already satisfied: attrs>=19.2.0 in c:\users\swati\anaconda3\lib\site-packages (from fiona>=1.8.21->geopandas) (23.1.0)
Requirement already satisfied: certifi in c:\users\swati\anaconda3\lib\site-packages (from fiona>=1.8.21->geopandas) (2024.2.2)
Requirement already satisfied: click~=8.0 in c:\users\swati\anaconda3\lib\site-packages (from fiona>=1.8.21->geopandas) (8.1.7)
Requirement already satisfied: click-plugins>=1.0 in c:\users\swati\anaconda3\lib\site-packages (from fiona>=1.8.21->geopandas) (1.1.1)
Requirement already satisfied: cligj>=0.5 in c:\users\swati\anaconda3\lib\site-packages (from fiona>=1.8.21->geopandas) (0.7.2)
Requirement already satisfied: six in c:\users\swati\anaconda3\lib\site-packages (from fiona>=1.8.21->geopandas) (1.16.0)
Requirement already satisfied: python-dateutil>=2.8.2 in c:\users\swati\anaconda3\lib\site-packages (from pandas>=1.4.0->geopandas) (2.8.2)
Requirement already satisfied: pytz>=2020.1 in c:\users\swati\anaconda3\lib\site-packages (from pandas>=1.4.0->geopandas) (2023.3.post1)
Requirement already satisfied: tzdata>=2022.1 in c:\users\swati\anaconda3\lib\site-packages (from pandas>=1.4.0->geopandas) (2023.3)
Requirement already satisfied: colorama in c:\users\swati\anaconda3\lib\site-packages (from click~=8.0->fiona>=1.8.21->geopandas) (0.4.6)
Note: you may need to restart the kernel to use updated packages.
In [2]:
pip install geopy
Requirement already satisfied: geopy in c:\users\swati\anaconda3\lib\site-packages (2.4.1)
Requirement already satisfied: geographiclib<3,>=1.52 in c:\users\swati\anaconda3\lib\site-packages (from geopy) (2.0)
Note: you may need to restart the kernel to use updated packages.
In [3]:
pip install folium
Requirement already satisfied: folium in c:\users\swati\anaconda3\lib\site-packages (0.16.0)
Requirement already satisfied: branca>=0.6.0 in c:\users\swati\anaconda3\lib\site-packages (from folium) (0.7.2)
Requirement already satisfied: jinja2>=2.9 in c:\users\swati\anaconda3\lib\site-packages (from folium) (3.1.3)
Requirement already satisfied: numpy in c:\users\swati\anaconda3\lib\site-packages (from folium) (1.26.4)
Requirement already satisfied: requests in c:\users\swati\anaconda3\lib\site-packages (from folium) (2.31.0)
Requirement already satisfied: xyzservices in c:\users\swati\anaconda3\lib\site-packages (from folium) (2022.9.0)
Requirement already satisfied: MarkupSafe>=2.0 in c:\users\swati\anaconda3\lib\site-packages (from jinja2>=2.9->folium) (2.1.3)
Requirement already satisfied: charset-normalizer<4,>=2 in c:\users\swati\anaconda3\lib\site-packages (from requests->folium) (2.0.4)
Requirement already satisfied: idna<4,>=2.5 in c:\users\swati\anaconda3\lib\site-packages (from requests->folium) (3.4)
Requirement already satisfied: urllib3<3,>=1.21.1 in c:\users\swati\anaconda3\lib\site-packages (from requests->folium) (1.26.18)
Requirement already satisfied: certifi>=2017.4.17 in c:\users\swati\anaconda3\lib\site-packages (from requests->folium) (2024.2.2)
Note: you may need to restart the kernel to use updated packages.
In [4]:
import pandas as pd
import geopandas as gpd
import folium
from folium import Choropleth, Circle, Marker
from folium.plugins import HeatMap, MarkerCluster
from geopy.geocoders import Nominatim

Project 1: Analyzing Impact of Toxic gas release in Pennsylvania¶

The following dataset from the US Environmental Protection Agency (EPA) tracks the release of toxic chemical in Philadelphia, Pennsylvania, USA.

In [5]:
releases = gpd.read_file(r"C:\Users\swati\Desktop\DATA ANALYSIS\Introduction to Geospatial analysis\toxic_release_pennsylvania\toxic_release_pennsylvania\toxic_release_pennsylvania.shp") 
releases.head()
Out[5]:
YEAR CITY COUNTY ST LATITUDE LONGITUDE CHEMICAL UNIT_OF_ME TOTAL_RELE geometry
0 2016 PHILADELPHIA PHILADELPHIA PA 40.005901 -75.072103 FORMIC ACID Pounds 0.160 POINT (2718560.227 256380.179)
1 2016 PHILADELPHIA PHILADELPHIA PA 39.920120 -75.146410 ETHYLENE GLYCOL Pounds 13353.480 POINT (2698674.606 224522.905)
2 2016 PHILADELPHIA PHILADELPHIA PA 40.023880 -75.220450 CERTAIN GLYCOL ETHERS Pounds 104.135 POINT (2676833.394 261701.856)
3 2016 PHILADELPHIA PHILADELPHIA PA 39.913540 -75.198890 LEAD COMPOUNDS Pounds 1730.280 POINT (2684030.004 221697.388)
4 2016 PHILADELPHIA PHILADELPHIA PA 39.913540 -75.198890 BENZENE Pounds 39863.290 POINT (2684030.004 221697.388)

The below dataset consists readings from air quality stations across the city.

In [6]:
stations = gpd.read_file(r"C:\Users\swati\Desktop\DATA ANALYSIS\Introduction to Geospatial analysis\PhillyHealth_Air_Monitoring_Stations\PhillyHealth_Air_Monitoring_Stations\PhillyHealth_Air_Monitoring_Stations.shp")
stations.head()
Out[6]:
SITE_NAME ADDRESS BLACK_CARB ULTRAFINE_ CO SO2 OZONE NO2 NOY_NO PM10 ... PAMS_VOC TSP_11101 TSP_METALS TSP_LEAD TOXICS_TO1 MET COMMUNITY_ LATITUDE LONGITUDE geometry
0 LAB 1501 East Lycoming Avenue N N Y N Y Y Y N ... Y N Y N y N N 40.008606 -75.097624 POINT (2711384.641 257149.310)
1 ROX Eva and Dearnley Streets N N N N N N N N ... N N Y N Y N N 40.050461 -75.236966 POINT (2671934.290 271248.900)
2 NEA Grant Avenue and Ashton Street N N N N Y N N N ... N N N N N Y N 40.072073 -75.013128 POINT (2734326.638 280980.247)
3 CHS 500 South Broad Street N N N N N N N N ... N N Y N Y N N 39.944510 -75.165442 POINT (2693078.580 233247.101)
4 NEW 2861 Lewis Street N N Y Y Y N Y Y ... N Y N Y N Y N 39.991688 -75.080378 POINT (2716399.773 251134.976)

5 rows × 24 columns

Measuring distance¶

Measuring the distance between the plant that releases the toxic gas and the air quality monitoring stations.

In [7]:
print(stations.crs)
print(releases.crs)
EPSG:2272
EPSG:2272
In [8]:
# Selecting one release incident for analysis
recent_release = releases.iloc[360]
recent_release
Out[8]:
YEAR                                                       2012
CITY                                               PHILADELPHIA
COUNTY                                             PHILADELPHIA
ST                                                           PA
LATITUDE                                               39.91354
LONGITUDE                                             -75.19889
CHEMICAL      SULFURIC ACID (1994 AND AFTER ACID AEROSOLS" O...
UNIT_OF_ME                                               Pounds
TOTAL_RELE                                             365600.0
geometry           POINT (2684030.0040213093 221697.3882902659)
Name: 360, dtype: object
In [9]:
# Measure distance from release to each station
distances = stations.geometry.distance(recent_release.geometry)
distances
Out[9]:
0     44778.509761
1     51006.456589
2     77744.509207
3     14672.170878
4     43753.554393
5      4711.658655
6     23197.430858
7     12072.823097
8     79081.825506
9      3780.623591
10    27577.474903
11    19818.381002
dtype: float64

The mean distance of the release to air quality monitoring station:

In [10]:
print('Mean distance to monitoring stations: {} feet'.format(distances.mean()))
Mean distance to monitoring stations: 33516.28487007786 feet

The closest monitoring station:

In [11]:
print('Closest monitoring station ({} feet):'.format(distances.min()))
print(stations.iloc[distances.idxmin()][["ADDRESS", "LATITUDE", "LONGITUDE"]])
Closest monitoring station (3780.623590556444 feet):
ADDRESS      3100 Penrose Ferry Road
LATITUDE                    39.91279
LONGITUDE                 -75.185448
Name: 9, dtype: object

Creating a buffer¶

Creating a buffer to understand how many toxic gas releasing plants are within 2 miles of any air quality monitoring station.

In [12]:
two_mile_buffer = stations.geometry.buffer(2*5280)
two_mile_buffer.head()
Out[12]:
0    POLYGON ((2721944.641 257149.310, 2721893.792 ...
1    POLYGON ((2682494.290 271248.900, 2682443.441 ...
2    POLYGON ((2744886.638 280980.247, 2744835.789 ...
3    POLYGON ((2703638.580 233247.101, 2703587.731 ...
4    POLYGON ((2726959.773 251134.976, 2726908.924 ...
dtype: geometry
In [13]:
# Create map with release incidents and monitoring stations
m = folium.Map(location=[39.9526,-75.1652], zoom_start=11)
HeatMap(data=releases[['LATITUDE', 'LONGITUDE']], radius=15).add_to(m)
for idx, row in stations.iterrows():
    Marker([row['LATITUDE'], row['LONGITUDE']]).add_to(m)
m    
# Plot each polygon on the map
folium.GeoJson(two_mile_buffer.to_crs(epsg=4326)).add_to(m)
m
Out[13]:
Make this Notebook Trusted to load map: File -> Trust Notebook

Identifying whether release occured within 2 miles of air quality monitoring station:

In [14]:
# Turn group of polygons into single multipolygon
my_union = two_mile_buffer.geometry.unary_union
print('Type:', type(my_union))

# Show the MultiPolygon object
my_union
Type: <class 'shapely.geometry.multipolygon.MultiPolygon'>
Out[14]:
No description has been provided for this image

The release considered in this study does exist within 2 miles of a station.

In [15]:
my_union.contains(releases.iloc[360].geometry)
Out[15]:
True

But not all releases occur within 2 miles:

In [16]:
# The closest station is more than two miles away
my_union.contains(releases.iloc[358].geometry)
Out[16]:
False

Project 2: Hospital Proximity Analysis¶

Analysing how many hospitals are available to respond to crash collision accidents in New York City.

In [17]:
collisions = gpd.read_file(r"C:\Users\swati\Desktop\DATA ANALYSIS\Introduction to Geospatial analysis\NYPD_Motor_Vehicle_Collisions\NYPD_Motor_Vehicle_Collisions\NYPD_Motor_Vehicle_Collisions.shp")
collisions.head()
Out[17]:
DATE TIME BOROUGH ZIP CODE LATITUDE LONGITUDE LOCATION ON STREET CROSS STRE OFF STREET ... CONTRIBU_2 CONTRIBU_3 CONTRIBU_4 UNIQUE KEY VEHICLE TY VEHICLE _1 VEHICLE _2 VEHICLE _3 VEHICLE _4 geometry
0 07/30/2019 0:00 BRONX 10464 40.841100 -73.784960 (40.8411, -73.78496) None None 121 PILOT STREET ... Unspecified None None 4180045 Sedan Station Wagon/Sport Utility Vehicle Station Wagon/Sport Utility Vehicle None None POINT (1043750.211 245785.815)
1 07/30/2019 0:10 QUEENS 11423 40.710827 -73.770660 (40.710827, -73.77066) JAMAICA AVENUE 188 STREET None ... None None None 4180007 Sedan Sedan None None None POINT (1047831.185 198333.171)
2 07/30/2019 0:25 None None 40.880318 -73.841286 (40.880318, -73.841286) BOSTON ROAD None None ... None None None 4179575 Sedan Station Wagon/Sport Utility Vehicle None None None POINT (1028139.293 260041.178)
3 07/30/2019 0:35 MANHATTAN 10036 40.756744 -73.984590 (40.756744, -73.98459) None None 155 WEST 44 STREET ... None None None 4179544 Box Truck Station Wagon/Sport Utility Vehicle None None None POINT (988519.261 214979.320)
4 07/30/2019 10:00 BROOKLYN 11223 40.600090 -73.965910 (40.60009, -73.96591) AVENUE T OCEAN PARKWAY None ... None None None 4180660 Station Wagon/Sport Utility Vehicle Bike None None None POINT (993716.669 157907.212)

5 rows × 30 columns

Collision hotspots of New York city:

In [18]:
m_1 = folium.Map(location=[40.7, -74], zoom_start=11) 

HeatMap(data=collisions[['LATITUDE','LONGITUDE']],radius=10).add_to(m_1)

m_1
Out[18]:
Make this Notebook Trusted to load map: File -> Trust Notebook

Understanding Hospital Coverage¶

The below dataset consists of all the hospitals in New York City:

In [19]:
hospitals = gpd.read_file(r"C:\Users\swati\Desktop\DATA ANALYSIS\Introduction to Geospatial analysis\nyu_2451_34494\nyu_2451_34494\nyu_2451_34494.shp")
hospitals.head()
Out[19]:
id name address zip factype facname capacity capname bcode xcoord ycoord latitude longitude geometry
0 317000001H1178 BRONX-LEBANON HOSPITAL CENTER - CONCOURSE DIVI... 1650 Grand Concourse 10457 3102 Hospital 415 Beds 36005 1008872.0 246596.0 40.843490 -73.911010 POINT (1008872.000 246596.000)
1 317000001H1164 BRONX-LEBANON HOSPITAL CENTER - FULTON DIVISION 1276 Fulton Ave 10456 3102 Hospital 164 Beds 36005 1011044.0 242204.0 40.831429 -73.903178 POINT (1011044.000 242204.000)
2 317000011H1175 CALVARY HOSPITAL INC 1740-70 Eastchester Rd 10461 3102 Hospital 225 Beds 36005 1027505.0 248287.0 40.848060 -73.843656 POINT (1027505.000 248287.000)
3 317000002H1165 JACOBI MEDICAL CENTER 1400 Pelham Pkwy 10461 3102 Hospital 457 Beds 36005 1027042.0 251065.0 40.855687 -73.845311 POINT (1027042.000 251065.000)
4 317000008H1172 LINCOLN MEDICAL & MENTAL HEALTH CENTER 234 E 149 St 10451 3102 Hospital 362 Beds 36005 1005154.0 236853.0 40.816758 -73.924478 POINT (1005154.000 236853.000)

Percentage of collisions more than 10km away from closest hospital¶

In [20]:
m_2 = folium.Map(location=[40.7, -74], zoom_start=11) 

# Your code here: Visualize the hospital locations
for idx,row in hospitals.iterrows():
    Marker([row['latitude'],row['longitude']],popup=row['name']).add_to(m_1)

m_1
Out[20]:
Make this Notebook Trusted to load map: File -> Trust Notebook
In [21]:
coverage = gpd.GeoDataFrame(geometry=hospitals.geometry).buffer(10000)
my_union = coverage.geometry.unary_union
outside_range = collisions.loc[~collisions["geometry"].apply(lambda x: my_union.contains(x))]
In [22]:
percentage = round(100*len(outside_range)/len(collisions), 2)
print("Percentage of collisions more than 10 km away from the closest hospital: {}%".format(percentage))
Percentage of collisions more than 10 km away from the closest hospital: 15.12%

Identifying the best potential hospital choice for collisions that occur in distant location:

In [23]:
def best_hospital(collision_location):
    idx_min = hospitals.geometry.distance(collision_location).idxmin()
    my_hospital = hospitals.iloc[idx_min]
    name = my_hospital["name"]
    return name

print(best_hospital(outside_range.geometry.iloc[0]))
CALVARY HOSPITAL INC

Identifying hospital that have the highest demand of distant collision cases:

In [24]:
highest_demand = outside_range.geometry.apply(best_hospital).value_counts().idxmax()
In [25]:
highest_demand
Out[25]:
'JAMAICA HOSPITAL MEDICAL CENTER'